Introduction
In my vectorization using .NET APIs blog, I describe the SIMD datatypes Vector64<T> and Vector128<T> that the ‘Arm64 hardware intrinsic’ APIs operate on. These APIs are present under the System.Runtime.Intrinsics.Arm.AdvSimd and System.Runtime.Intrinsics.Arm.AdvSimd.Arm64 classes. In this post I will describe those hardware intrinsic APIs by showing sample usage along with examples and the generated Arm64 code. This should help people understand these APIs so they can use them to optimize .NET code that targets Arm64. Since there are 360 APIs, describing all of them in a single post would be overwhelming, so I have divided them among 8 blogs and will demonstrate 45 APIs in each. This is part 2 of that blog series. You can check out my previous blog at Part 1.
Most of the descriptions of these APIs are adapted from the Arm Architecture Reference Manual Armv8, for Armv8-A architecture profile document. You can also refer to the descriptions of the SIMD and floating-point instructions on the Arm developer docs page.
The blog page is programmatically generated and might contain mistakes. If you find any mistake, please leave a comment and I will address it.
APIs covered
1. CompareGreaterThanOrEqual
Vector64<byte> CompareGreaterThanOrEqual(Vector64<byte> left, Vector64<byte> right)
This method compares corresponding vector elements in the left and right vectors. If the left element is greater than or equal to the right element, it sets every bit of the corresponding element in the result vector to one; otherwise it sets every bit to zero. It then returns the result vector.
private Vector64<byte> CompareGreaterThanOrEqualTest(Vector64<byte> left, Vector64<byte> right)
{
    return AdvSimd.CompareGreaterThanOrEqual(left, right);
}
// left = <31, 22, 33, 34, 15, 16, 17, 18>
// right = <21, 22, 23, 24, 25, 26, 27, 28>
// Result = <255, 255, 255, 255, 0, 0, 0, 0>
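To make the mask semantics concrete, here is a minimal scalar sketch in plain C#. It is a reference model only, not the intrinsic itself, so it runs on any platform. The names CompareMaskSketch, GreaterThanOrEqualMask and Select are invented for this illustration. The Select helper shows one common way such a mask is consumed: picking bits from one input where the mask is set and from the other input elsewhere.

```csharp
using System;

public static class CompareMaskSketch
{
    // Scalar model of the unsigned byte comparison:
    // 0xFF where left >= right, 0x00 otherwise.
    public static byte[] GreaterThanOrEqualMask(byte[] left, byte[] right)
    {
        if (left.Length != right.Length)
            throw new ArgumentException("Inputs must have the same length.");

        var mask = new byte[left.Length];
        for (int i = 0; i < left.Length; i++)
            mask[i] = left[i] >= right[i] ? (byte)0xFF : (byte)0x00;
        return mask;
    }

    // Branch-free select: takes bits from 'left' where the mask is set
    // and from 'right' where it is clear.
    public static byte[] Select(byte[] mask, byte[] left, byte[] right)
    {
        var result = new byte[mask.Length];
        for (int i = 0; i < mask.Length; i++)
            result[i] = (byte)((mask[i] & left[i]) | (~mask[i] & right[i]));
        return result;
    }
}
```

With the inputs from the example above, GreaterThanOrEqualMask yields <255, 255, 255, 255, 0, 0, 0, 0>, and feeding that mask to Select yields the element-wise maximum <31, 22, 33, 34, 25, 26, 27, 28>.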
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd
Vector64<short> CompareGreaterThanOrEqual(Vector64<short> left, Vector64<short> right)
Vector64<int> CompareGreaterThanOrEqual(Vector64<int> left, Vector64<int> right)
Vector64<sbyte> CompareGreaterThanOrEqual(Vector64<sbyte> left, Vector64<sbyte> right)
Vector64<float> CompareGreaterThanOrEqual(Vector64<float> left, Vector64<float> right)
Vector64<ushort> CompareGreaterThanOrEqual(Vector64<ushort> left, Vector64<ushort> right)
Vector64<uint> CompareGreaterThanOrEqual(Vector64<uint> left, Vector64<uint> right)
Vector128<byte> CompareGreaterThanOrEqual(Vector128<byte> left, Vector128<byte> right)
Vector128<short> CompareGreaterThanOrEqual(Vector128<short> left, Vector128<short> right)
Vector128<int> CompareGreaterThanOrEqual(Vector128<int> left, Vector128<int> right)
Vector128<sbyte> CompareGreaterThanOrEqual(Vector128<sbyte> left, Vector128<sbyte> right)
Vector128<float> CompareGreaterThanOrEqual(Vector128<float> left, Vector128<float> right)
Vector128<ushort> CompareGreaterThanOrEqual(Vector128<ushort> left, Vector128<ushort> right)
Vector128<uint> CompareGreaterThanOrEqual(Vector128<uint> left, Vector128<uint> right)
// class System.Runtime.Intrinsics.Arm.AdvSimd.Arm64
Vector128<double> CompareGreaterThanOrEqual(Vector128<double> left, Vector128<double> right)
Vector128<long> CompareGreaterThanOrEqual(Vector128<long> left, Vector128<long> right)
Vector128<ulong> CompareGreaterThanOrEqual(Vector128<ulong> left, Vector128<ulong> right)
See Microsoft docs here and here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:CompareGreaterThanOrEqualTest(System.Runtime.Intrinsics.Vector64`1[Byte],System.Runtime.Intrinsics.Vector64`1[Byte]):System.Runtime.Intrinsics.Vector64`1[Byte]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
; V01 arg1 [V01,T01] ( 3, 3 ) simd8 -> d1 HFA(simd8)
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
cmhs v16.8b, v0.8b, v1.8b
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
2. CompareGreaterThanOrEqualScalar
Vector64<double> CompareGreaterThanOrEqualScalar(Vector64<double> left, Vector64<double> right)
This method compares the first element of the left and right vectors. If the left value is greater than or equal to the right value, it sets every bit of the corresponding element in the result vector to one; otherwise it sets every bit to zero. It then returns the result vector.
private Vector64<double> CompareGreaterThanOrEqualScalarTest(Vector64<double> left, Vector64<double> right)
{
    return AdvSimd.Arm64.CompareGreaterThanOrEqualScalar(left, right);
}
// left = <11.5>
// right = <11.5>
// Result = <NaN>
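The NaN in the result above is not an error. A "true" comparison sets all 64 bits of the element, and that bit pattern, when printed as a double, happens to be a NaN. The following minimal sketch demonstrates this with BitConverter; the class and method names are hypothetical and exist only for this illustration.

```csharp
using System;

public static class AllOnesAsDouble
{
    // Reinterprets the "true" mask (all 64 bits set) as a double.
    public static double TrueMaskAsDouble() => BitConverter.Int64BitsToDouble(-1L);

    // Reads the raw bits back so the mask can be tested as an integer
    // instead of being compared as a floating-point value.
    public static bool IsTrueMask(double lane) => BitConverter.DoubleToInt64Bits(lane) == -1L;
}
```

Here double.IsNaN(AllOnesAsDouble.TrueMaskAsDouble()) is true, which is why the result prints as NaN. Because a NaN never compares equal to anything, it is usually safer to inspect such a result through its integer bits, as IsTrueMask does.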
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd.Arm64
Vector64<long> CompareGreaterThanOrEqualScalar(Vector64<long> left, Vector64<long> right)
Vector64<float> CompareGreaterThanOrEqualScalar(Vector64<float> left, Vector64<float> right)
Vector64<ulong> CompareGreaterThanOrEqualScalar(Vector64<ulong> left, Vector64<ulong> right)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:CompareGreaterThanOrEqualScalarTest(System.Runtime.Intrinsics.Vector64`1[Double],System.Runtime.Intrinsics.Vector64`1[Double]):System.Runtime.Intrinsics.Vector64`1[Double]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
; V01 arg1 [V01,T01] ( 3, 3 ) simd8 -> d1 HFA(simd8)
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcmge d16, d0, d1
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
3. CompareGreaterThanScalar
Vector64<double> CompareGreaterThanScalar(Vector64<double> left, Vector64<double> right)
This method compares the first element of the left and right vectors. If the left value is greater than the right value, it sets every bit of the corresponding element in the result vector to one; otherwise it sets every bit to zero. It then returns the result vector.
private Vector64<double> CompareGreaterThanScalarTest(Vector64<double> left, Vector64<double> right)
{
    return AdvSimd.Arm64.CompareGreaterThanScalar(left, right);
}
// left = <11.5>
// right = <11.5>
// Result = <0>
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd.Arm64
Vector64<long> CompareGreaterThanScalar(Vector64<long> left, Vector64<long> right)
Vector64<float> CompareGreaterThanScalar(Vector64<float> left, Vector64<float> right)
Vector64<ulong> CompareGreaterThanScalar(Vector64<ulong> left, Vector64<ulong> right)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:CompareGreaterThanScalarTest(System.Runtime.Intrinsics.Vector64`1[Double],System.Runtime.Intrinsics.Vector64`1[Double]):System.Runtime.Intrinsics.Vector64`1[Double]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
; V01 arg1 [V01,T01] ( 3, 3 ) simd8 -> d1 HFA(simd8)
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcmgt d16, d0, d1
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
4. CompareLessThan
Vector64<byte> CompareLessThan(Vector64<byte> left, Vector64<byte> right)
This method compares corresponding vector elements in the left and right vectors. If the left element is less than the right element, it sets every bit of the corresponding element in the result vector to one; otherwise it sets every bit to zero. It then returns the result vector.
private Vector64<byte> CompareLessThanTest(Vector64<byte> left, Vector64<byte> right)
{
    return AdvSimd.CompareLessThan(left, right);
}
// left = <11, 12, 13, 14, 15, 16, 17, 18>
// right = <1, 22, 3, 4, 25, 26, 27, 28>
// Result = <0, 255, 0, 0, 255, 255, 255, 255>
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd
Vector64<short> CompareLessThan(Vector64<short> left, Vector64<short> right)
Vector64<int> CompareLessThan(Vector64<int> left, Vector64<int> right)
Vector64<sbyte> CompareLessThan(Vector64<sbyte> left, Vector64<sbyte> right)
Vector64<float> CompareLessThan(Vector64<float> left, Vector64<float> right)
Vector64<ushort> CompareLessThan(Vector64<ushort> left, Vector64<ushort> right)
Vector64<uint> CompareLessThan(Vector64<uint> left, Vector64<uint> right)
Vector128<byte> CompareLessThan(Vector128<byte> left, Vector128<byte> right)
Vector128<short> CompareLessThan(Vector128<short> left, Vector128<short> right)
Vector128<int> CompareLessThan(Vector128<int> left, Vector128<int> right)
Vector128<sbyte> CompareLessThan(Vector128<sbyte> left, Vector128<sbyte> right)
Vector128<float> CompareLessThan(Vector128<float> left, Vector128<float> right)
Vector128<ushort> CompareLessThan(Vector128<ushort> left, Vector128<ushort> right)
Vector128<uint> CompareLessThan(Vector128<uint> left, Vector128<uint> right)
// class System.Runtime.Intrinsics.Arm.AdvSimd.Arm64
Vector128<double> CompareLessThan(Vector128<double> left, Vector128<double> right)
Vector128<long> CompareLessThan(Vector128<long> left, Vector128<long> right)
Vector128<ulong> CompareLessThan(Vector128<ulong> left, Vector128<ulong> right)
See Microsoft docs here and here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:CompareLessThanTest(System.Runtime.Intrinsics.Vector64`1[Byte],System.Runtime.Intrinsics.Vector64`1[Byte]):System.Runtime.Intrinsics.Vector64`1[Byte]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
; V01 arg1 [V01,T01] ( 3, 3 ) simd8 -> d1 HFA(simd8)
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
cmhi v16.8b, v1.8b, v0.8b
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
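Note that the listing above emits cmhi with the operands swapped (v1 before v0): a less-than comparison is expressed as a greater-than comparison with left and right exchanged. A minimal scalar sketch of that identity follows. SwappedCompareSketch, UnsignedHigher and LessThanMask are hypothetical names used only for illustration.

```csharp
using System;

public static class SwappedCompareSketch
{
    // Scalar model of an unsigned "higher" comparison:
    // 0xFF where first > second, 0x00 otherwise.
    public static byte[] UnsignedHigher(byte[] first, byte[] second)
    {
        if (first.Length != second.Length)
            throw new ArgumentException("Inputs must have the same length.");

        var mask = new byte[first.Length];
        for (int i = 0; i < first.Length; i++)
            mask[i] = first[i] > second[i] ? (byte)0xFF : (byte)0x00;
        return mask;
    }

    // left < right is the same question as right > left,
    // so the operands are simply passed in the opposite order.
    public static byte[] LessThanMask(byte[] left, byte[] right) => UnsignedHigher(right, left);
}
```

For left = <11, 12, 13, 14, 15, 16, 17, 18> and right = <1, 22, 3, 4, 25, 26, 27, 28>, LessThanMask returns <0, 255, 0, 0, 255, 255, 255, 255>, matching the CompareLessThan example.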
5. CompareLessThanOrEqual
Vector64<byte> CompareLessThanOrEqual(Vector64<byte> left, Vector64<byte> right)
This method compares corresponding vector elements in the left and right vectors. If the left element is less than or equal to the right element, it sets every bit of the corresponding element in the result vector to one; otherwise it sets every bit to zero. It then returns the result vector.
private Vector64<byte> CompareLessThanOrEqualTest(Vector64<byte> left, Vector64<byte> right)
{
    return AdvSimd.CompareLessThanOrEqual(left, right);
}
// left = <11, 12, 13, 14, 15, 16, 17, 18>
// right = <1, 12, 3, 4, 25, 26, 27, 28>
// Result = <0, 255, 0, 0, 255, 255, 255, 255>
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd
Vector64<short> CompareLessThanOrEqual(Vector64<short> left, Vector64<short> right)
Vector64<int> CompareLessThanOrEqual(Vector64<int> left, Vector64<int> right)
Vector64<sbyte> CompareLessThanOrEqual(Vector64<sbyte> left, Vector64<sbyte> right)
Vector64<float> CompareLessThanOrEqual(Vector64<float> left, Vector64<float> right)
Vector64<ushort> CompareLessThanOrEqual(Vector64<ushort> left, Vector64<ushort> right)
Vector64<uint> CompareLessThanOrEqual(Vector64<uint> left, Vector64<uint> right)
Vector128<byte> CompareLessThanOrEqual(Vector128<byte> left, Vector128<byte> right)
Vector128<short> CompareLessThanOrEqual(Vector128<short> left, Vector128<short> right)
Vector128<int> CompareLessThanOrEqual(Vector128<int> left, Vector128<int> right)
Vector128<sbyte> CompareLessThanOrEqual(Vector128<sbyte> left, Vector128<sbyte> right)
Vector128<float> CompareLessThanOrEqual(Vector128<float> left, Vector128<float> right)
Vector128<ushort> CompareLessThanOrEqual(Vector128<ushort> left, Vector128<ushort> right)
Vector128<uint> CompareLessThanOrEqual(Vector128<uint> left, Vector128<uint> right)
// class System.Runtime.Intrinsics.Arm.AdvSimd.Arm64
Vector128<double> CompareLessThanOrEqual(Vector128<double> left, Vector128<double> right)
Vector128<long> CompareLessThanOrEqual(Vector128<long> left, Vector128<long> right)
Vector128<ulong> CompareLessThanOrEqual(Vector128<ulong> left, Vector128<ulong> right)
See Microsoft docs here and here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:CompareLessThanOrEqualTest(System.Runtime.Intrinsics.Vector64`1[Byte],System.Runtime.Intrinsics.Vector64`1[Byte]):System.Runtime.Intrinsics.Vector64`1[Byte]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
; V01 arg1 [V01,T01] ( 3, 3 ) simd8 -> d1 HFA(simd8)
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
cmhs v16.8b, v1.8b, v0.8b
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
6. CompareLessThanOrEqualScalar
Vector64<double> CompareLessThanOrEqualScalar(Vector64<double> left, Vector64<double> right)
This method compares the first element of the left and right vectors. If the left value is less than or equal to the right value, it sets every bit of the corresponding element in the result vector to one; otherwise it sets every bit to zero. It then returns the result vector.
private Vector64<double> CompareLessThanOrEqualScalarTest(Vector64<double> left, Vector64<double> right)
{
    return AdvSimd.Arm64.CompareLessThanOrEqualScalar(left, right);
}
// left = <11.5>
// right = <11.5>
// Result = <NaN>
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd.Arm64
Vector64<long> CompareLessThanOrEqualScalar(Vector64<long> left, Vector64<long> right)
Vector64<float> CompareLessThanOrEqualScalar(Vector64<float> left, Vector64<float> right)
Vector64<ulong> CompareLessThanOrEqualScalar(Vector64<ulong> left, Vector64<ulong> right)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:CompareLessThanOrEqualScalarTest(System.Runtime.Intrinsics.Vector64`1[Double],System.Runtime.Intrinsics.Vector64`1[Double]):System.Runtime.Intrinsics.Vector64`1[Double]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
; V01 arg1 [V01,T01] ( 3, 3 ) simd8 -> d1 HFA(simd8)
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcmge d16, d1, d0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
7. CompareLessThanScalar
Vector64<double> CompareLessThanScalar(Vector64<double> left, Vector64<double> right)
This method compares the first element of the left and right vectors. If the left value is less than the right value, it sets every bit of the corresponding element in the result vector to one; otherwise it sets every bit to zero. It then returns the result vector.
private Vector64<double> CompareLessThanScalarTest(Vector64<double> left, Vector64<double> right)
{
    return AdvSimd.Arm64.CompareLessThanScalar(left, right);
}
// left = <11.5>
// right = <11.5>
// Result = <0>
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd.Arm64
Vector64<long> CompareLessThanScalar(Vector64<long> left, Vector64<long> right)
Vector64<float> CompareLessThanScalar(Vector64<float> left, Vector64<float> right)
Vector64<ulong> CompareLessThanScalar(Vector64<ulong> left, Vector64<ulong> right)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:CompareLessThanScalarTest(System.Runtime.Intrinsics.Vector64`1[Double],System.Runtime.Intrinsics.Vector64`1[Double]):System.Runtime.Intrinsics.Vector64`1[Double]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
; V01 arg1 [V01,T01] ( 3, 3 ) simd8 -> d1 HFA(simd8)
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcmgt d16, d1, d0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
8. CompareTest
Vector64<byte> CompareTest(Vector64<byte> left, Vector64<byte> right)
This method performs a bitwise AND of corresponding vector elements in the left and right vectors. If the result is not zero, it sets every bit of the corresponding element in the result vector to one; otherwise it sets every bit to zero. It then returns the result vector.
private Vector64<byte> CompareTestTest(Vector64<byte> left, Vector64<byte> right)
{
    return AdvSimd.CompareTest(left, right);
}
// left = <11, 12, 13, 14, 15, 16, 17, 18>
// right = <4, 22, 23, 24, 25, 26, 27, 28>
// Result = <0, 255, 255, 255, 255, 255, 255, 255>
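The first element of the example above is the interesting one: 11 (binary 1011) and 4 (binary 0100) share no set bits, so the AND is zero and the mask is zero. A minimal scalar model of that behaviour is sketched below. CompareTestSketch and TestBitsMask are names invented for this illustration and are not part of the .NET API.

```csharp
using System;

public static class CompareTestSketch
{
    // Scalar model of the bit test: 0xFF where left and right have
    // at least one set bit in common, 0x00 otherwise.
    public static byte[] TestBitsMask(byte[] left, byte[] right)
    {
        if (left.Length != right.Length)
            throw new ArgumentException("Inputs must have the same length.");

        var mask = new byte[left.Length];
        for (int i = 0; i < left.Length; i++)
            mask[i] = (left[i] & right[i]) != 0 ? (byte)0xFF : (byte)0x00;
        return mask;
    }
}
```

With the example inputs this returns <0, 255, 255, 255, 255, 255, 255, 255>. A typical use is flag checking: put the flag bit you care about in every element of right, and the mask tells you which elements of left have it set.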
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd
Vector64<short> CompareTest(Vector64<short> left, Vector64<short> right)
Vector64<int> CompareTest(Vector64<int> left, Vector64<int> right)
Vector64<sbyte> CompareTest(Vector64<sbyte> left, Vector64<sbyte> right)
Vector64<float> CompareTest(Vector64<float> left, Vector64<float> right)
Vector64<ushort> CompareTest(Vector64<ushort> left, Vector64<ushort> right)
Vector64<uint> CompareTest(Vector64<uint> left, Vector64<uint> right)
Vector128<byte> CompareTest(Vector128<byte> left, Vector128<byte> right)
Vector128<short> CompareTest(Vector128<short> left, Vector128<short> right)
Vector128<int> CompareTest(Vector128<int> left, Vector128<int> right)
Vector128<sbyte> CompareTest(Vector128<sbyte> left, Vector128<sbyte> right)
Vector128<float> CompareTest(Vector128<float> left, Vector128<float> right)
Vector128<ushort> CompareTest(Vector128<ushort> left, Vector128<ushort> right)
Vector128<uint> CompareTest(Vector128<uint> left, Vector128<uint> right)
// class System.Runtime.Intrinsics.Arm.AdvSimd.Arm64
Vector128<double> CompareTest(Vector128<double> left, Vector128<double> right)
Vector128<long> CompareTest(Vector128<long> left, Vector128<long> right)
Vector128<ulong> CompareTest(Vector128<ulong> left, Vector128<ulong> right)
See Microsoft docs here and here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:CompareTestTest(System.Runtime.Intrinsics.Vector64`1[Byte],System.Runtime.Intrinsics.Vector64`1[Byte]):System.Runtime.Intrinsics.Vector64`1[Byte]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
; V01 arg1 [V01,T01] ( 3, 3 ) simd8 -> d1 HFA(simd8)
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
cmtst v16.8b, v0.8b, v1.8b
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
9. CompareTestScalar
Vector64<double> CompareTestScalar(Vector64<double> left, Vector64<double> right)
This method performs a bitwise AND of the first element of the left and right vectors. If the result is not zero, it sets every bit of the corresponding element in the result vector to one; otherwise it sets every bit to zero. It then returns the result vector.
private Vector64<double> CompareTestScalarTest(Vector64<double> left, Vector64<double> right)
{
    return AdvSimd.Arm64.CompareTestScalar(left, right);
}
// left = <11.5>
// right = <11.5>
// Result = <NaN>
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd.Arm64
Vector64<long> CompareTestScalar(Vector64<long> left, Vector64<long> right)
Vector64<ulong> CompareTestScalar(Vector64<ulong> left, Vector64<ulong> right)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:CompareTestScalarTest(System.Runtime.Intrinsics.Vector64`1[Double],System.Runtime.Intrinsics.Vector64`1[Double]):System.Runtime.Intrinsics.Vector64`1[Double]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
; V01 arg1 [V01,T01] ( 3, 3 ) simd8 -> d1 HFA(simd8)
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
cmtst d16, d0, d1
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
10. ConvertToDouble
Vector128<double> ConvertToDouble(Vector64<float> value)
This method converts each element in the value vector to double the precision of the input element, using the rounding mode that, per the ARM docs, is determined by the FPCR, and returns the result vector.
private Vector128<double> ConvertToDoubleTest(Vector64<float> value)
{
    return AdvSimd.Arm64.ConvertToDouble(value);
}
// value = <11.5, 12.5>
// Result = <11.5, 12.5>
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd.Arm64
Vector128<double> ConvertToDouble(Vector128<long> value)
Vector128<double> ConvertToDouble(Vector128<ulong> value)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToDoubleTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector128`1[Double]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtl v16.2d, v0.2s
mov v0.16b, v16.16b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
11. ConvertToDoubleScalar
Vector64<double> ConvertToDoubleScalar(Vector64<long> value)
This method converts the signed integer element in the value vector to a double-precision floating-point value, using the rounding mode that, per the ARM docs, is determined by the FPCR, and returns the result vector.
private Vector64<double> ConvertToDoubleScalarTest(Vector64<long> value)
{
    return AdvSimd.Arm64.ConvertToDoubleScalar(value);
}
// value = <11>
// Result = <11>
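Small values such as 11 convert exactly, so the rounding mode only comes into play once a 64-bit integer needs more than the 53 bits of precision that a double offers. The sketch below illustrates this with an ordinary C# cast. It assumes the default round-to-nearest-even mode, and LongToDoubleSketch is a hypothetical name, not part of the .NET API.

```csharp
using System;

public static class LongToDoubleSketch
{
    // Scalar model of a signed 64-bit integer to double conversion.
    // Assumption: the default round-to-nearest-even mode is in effect.
    public static double ToDouble(long value) => (double)value;
}
```

ToDouble(11) is exactly 11, while ToDouble(9007199254740993), which is 2^53 + 1, cannot be represented and rounds to 9007199254740992.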
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd.Arm64
Vector64<double> ConvertToDoubleScalar(Vector64<ulong> value)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToDoubleScalarTest(System.Runtime.Intrinsics.Vector64`1[Int64]):System.Runtime.Intrinsics.Vector64`1[Double]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
scvtf d16, d0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
12. ConvertToDoubleUpper
Vector128<double> ConvertToDoubleUpper(Vector128<float> value)
This method converts each element in the upper half of the value vector to double the precision of the input element, using the rounding mode that, per the ARM docs, is determined by the FPCR, and returns the result vector. As seen in the example below, each result element is a double, which is twice as wide as the float elements of the input.
private Vector128<double> ConvertToDoubleUpperTest(Vector128<float> value)
{
    return AdvSimd.Arm64.ConvertToDoubleUpper(value);
}
// value = <11.5, 12.5, 13.5, 14.5>
// Result = <13.5, 14.5>
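Because each double is twice as wide as a float, a 128-bit result can hold only half of a 128-bit input, which is why there are separate lower-half and upper-half conversions. The scalar sketch below models both halves. WidenSketch, WidenLower and WidenUpper are names made up for this illustration.

```csharp
using System;

public static class WidenSketch
{
    // Scalar model: widens the lower half of the input to double.
    public static double[] WidenLower(float[] value) => Widen(value, 0);

    // Scalar model: widens the upper half of the input to double.
    public static double[] WidenUpper(float[] value) => Widen(value, value.Length / 2);

    private static double[] Widen(float[] value, int start)
    {
        int half = value.Length / 2;
        var result = new double[half];
        for (int i = 0; i < half; i++)
            result[i] = value[start + i]; // float to double widening is exact
        return result;
    }
}
```

For value = <11.5, 12.5, 13.5, 14.5>, WidenLower returns <11.5, 12.5> and WidenUpper returns <13.5, 14.5>, so calling both covers the whole input.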
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToDoubleUpperTest(System.Runtime.Intrinsics.Vector128`1[Single]):System.Runtime.Intrinsics.Vector128`1[Double]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd16 -> d0 HFA(simd16)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtl2 v16.2d, v0.4s
mov v0.16b, v16.16b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
13. ConvertToInt32RoundAwayFromZero
Vector64<int> ConvertToInt32RoundAwayFromZero(Vector64<float> value)
This method converts each element in the value vector from floating-point to a signed integer using the Round to Nearest with Ties to Away rounding mode, and returns the result vector.
private Vector64<int> ConvertToInt32RoundAwayFromZeroTest(Vector64<float> value)
{
    return AdvSimd.ConvertToInt32RoundAwayFromZero(value);
}
// value = <11.5, 12.5>
// Result = <12, 13>
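"Ties to Away" only changes the outcome for values exactly halfway between two integers, and it pushes them away from zero in both directions. The following scalar sketch models this with Math.Round and MidpointRounding.AwayFromZero. RoundAwaySketch is a hypothetical name, and inputs are assumed to be finite and within the range of int.

```csharp
using System;

public static class RoundAwaySketch
{
    // Scalar model: round to nearest, with ties going away from zero.
    // Assumption: inputs are finite and fit in an int.
    public static int[] Convert(float[] value)
    {
        var result = new int[value.Length];
        for (int i = 0; i < value.Length; i++)
            result[i] = (int)Math.Round((double)value[i], MidpointRounding.AwayFromZero);
        return result;
    }
}
```

Convert applied to <11.5, 12.5, -11.5, 11.4> gives <12, 13, -12, 11>: both positive ties round up, the negative tie rounds down to -12, and 11.4 is not a tie so it simply goes to the nearest integer.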
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd
Vector128<int> ConvertToInt32RoundAwayFromZero(Vector128<float> value)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt32RoundAwayFromZeroTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[Int32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtas v16.2s, v0.2s
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
14. ConvertToInt32RoundAwayFromZeroScalar
Vector64<int> ConvertToInt32RoundAwayFromZeroScalar(Vector64<float> value)
This method converts the first element of the value vector from floating-point to a signed integer using the Round to Nearest with Ties to Away rounding mode, and returns the result vector. As the example below shows, the remaining elements of the result are zero.
private Vector64<int> ConvertToInt32RoundAwayFromZeroScalarTest(Vector64<float> value)
{
    return AdvSimd.ConvertToInt32RoundAwayFromZeroScalar(value);
}
// value = <11.5, 12.5>
// Result = <12, 0>
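The result above shows how the Scalar variant differs from the vector variant: only the first element is converted, and the second element of the result is zero rather than 13. A minimal scalar model of that behaviour is sketched below. ScalarVariantSketch is a name invented for this illustration, and the input is assumed to be non-empty, finite and within the range of int.

```csharp
using System;

public static class ScalarVariantSketch
{
    // Scalar model of a "Scalar" conversion: element 0 is converted,
    // every other element of the result stays zero.
    public static int[] Convert(float[] value)
    {
        var result = new int[value.Length]; // new arrays are zero-initialized
        result[0] = (int)Math.Round((double)value[0], MidpointRounding.AwayFromZero);
        return result;
    }
}
```

Convert applied to <11.5, 12.5> gives <12, 0>, matching the example.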
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt32RoundAwayFromZeroScalarTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[Int32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtas s16, s0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
15. ConvertToInt32RoundToEven
Vector64<int> ConvertToInt32RoundToEven(Vector64<float> value)
This method converts each element in the value vector from floating-point to a signed integer using the Round to Nearest with Ties to Even rounding mode, and returns the result vector.
private Vector64<int> ConvertToInt32RoundToEvenTest(Vector64<float> value)
{
    return AdvSimd.ConvertToInt32RoundToEven(value);
}
// value = <11.5, 12.5>
// Result = <12, 12>
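In the example above both 11.5 and 12.5 become 12, because a tie goes to whichever neighbouring integer is even. The scalar sketch below models this with Math.Round and MidpointRounding.ToEven. RoundToEvenSketch is a hypothetical name, and inputs are assumed to be finite and within the range of int.

```csharp
using System;

public static class RoundToEvenSketch
{
    // Scalar model: round to nearest, with ties going to the even integer.
    // Assumption: inputs are finite and fit in an int.
    public static int[] Convert(float[] value)
    {
        var result = new int[value.Length];
        for (int i = 0; i < value.Length; i++)
            result[i] = (int)Math.Round((double)value[i], MidpointRounding.ToEven);
        return result;
    }
}
```

Convert applied to <11.5, 12.5, 13.5, -12.5> gives <12, 12, 14, -12>. Compared with rounding ties away from zero, this avoids a systematic upward bias when many values end in .5.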
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd
Vector128<int> ConvertToInt32RoundToEven(Vector128<float> value)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt32RoundToEvenTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[Int32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtns v16.2s, v0.2s
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
16. ConvertToInt32RoundToEvenScalar
Vector64<int> ConvertToInt32RoundToEvenScalar(Vector64<float> value)
This method converts the first element of the value vector from floating-point to a signed integer using the Round to Nearest with Ties to Even rounding mode, and returns the result vector. As the example below shows, the remaining elements of the result are zero.
private Vector64<int> ConvertToInt32RoundToEvenScalarTest(Vector64<float> value)
{
    return AdvSimd.ConvertToInt32RoundToEvenScalar(value);
}
// value = <11.5, 12.5>
// Result = <12, 0>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt32RoundToEvenScalarTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[Int32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtns s16, s0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
17. ConvertToInt32RoundToNegativeInfinity
Vector64<int> ConvertToInt32RoundToNegativeInfinity(Vector64<float> value)
This method converts each element in the value vector from floating-point to a signed integer using the Round towards Minus Infinity rounding mode, and returns the result vector.
private Vector64<int> ConvertToInt32RoundToNegativeInfinityTest(Vector64<float> value)
{
    return AdvSimd.ConvertToInt32RoundToNegativeInfinity(value);
}
// value = <11.5, 12.5>
// Result = <11, 12>
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd
Vector128<int> ConvertToInt32RoundToNegativeInfinity(Vector128<float> value)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt32RoundToNegativeInfinityTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[Int32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtms v16.2s, v0.2s
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
18. ConvertToInt32RoundToNegativeInfinityScalar
Vector64<int> ConvertToInt32RoundToNegativeInfinityScalar(Vector64<float> value)
This method converts the first element of the value vector from floating-point to a signed integer using the Round towards Minus Infinity rounding mode, and returns the result vector. As the example below shows, the remaining elements of the result are zero.
private Vector64<int> ConvertToInt32RoundToNegativeInfinityScalarTest(Vector64<float> value)
{
    return AdvSimd.ConvertToInt32RoundToNegativeInfinityScalar(value);
}
// value = <11.5, 12.5>
// Result = <11, 0>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt32RoundToNegativeInfinityScalarTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[Int32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtms s16, s0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
19. ConvertToInt32RoundToPositiveInfinity
Vector64<int> ConvertToInt32RoundToPositiveInfinity(Vector64<float> value)
This method converts each element in the value vector from floating-point to a signed integer using the Round towards Plus Infinity rounding mode, and returns the result vector.
private Vector64<int> ConvertToInt32RoundToPositiveInfinityTest(Vector64<float> value)
{
    return AdvSimd.ConvertToInt32RoundToPositiveInfinity(value);
}
// value = <11.5, 12.5>
// Result = <12, 13>
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd
Vector128<int> ConvertToInt32RoundToPositiveInfinity(Vector128<float> value)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt32RoundToPositiveInfinityTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[Int32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtps v16.2s, v0.2s
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
20. ConvertToInt32RoundToPositiveInfinityScalar
Vector64<int> ConvertToInt32RoundToPositiveInfinityScalar(Vector64<float> value)
This method converts the first element of the value vector from floating-point to a signed integer using the Round towards Plus Infinity rounding mode, and returns the result vector. As the example below shows, the remaining elements of the result are zero.
private Vector64<int> ConvertToInt32RoundToPositiveInfinityScalarTest(Vector64<float> value)
{
    return AdvSimd.ConvertToInt32RoundToPositiveInfinityScalar(value);
}
// value = <11.5, 12.5>
// Result = <12, 0>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt32RoundToPositiveInfinityScalarTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[Int32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtps s16, s0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
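To make the difference between the vector form and the Scalar form concrete, here is a minimal sketch. The class name ScalarConversionSketch and the method CeilingScalar are hypothetical names introduced for illustration, and the software fallback is only an approximation for in-range inputs, added so the snippet also runs on machines without Arm64 support.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.Arm;

// Hypothetical helper for illustration; not part of the .NET API.
public static class ScalarConversionSketch
{
    public static Vector64<int> CeilingScalar(Vector64<float> value)
    {
        if (AdvSimd.IsSupported)
        {
            // Hardware path: converts element 0 only (the listing above shows fcvtps s16, s0).
            return AdvSimd.ConvertToInt32RoundToPositiveInfinityScalar(value);
        }

        // Fallback (assumption: element 0 fits in an int).
        // CreateScalar sets element 0 and zeroes the remaining elements.
        return Vector64.CreateScalar((int)MathF.Ceiling(value.ToScalar()));
    }
}
```

Calling CeilingScalar with <11.5, 12.5> gives <12, 0> on either path, whereas the non-Scalar method in section 19 gives <12, 13>.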
21. ConvertToInt32RoundToZero
Vector64<int> ConvertToInt32RoundToZero(Vector64<float> value)
This method converts each element in the value vector from a floating-point value to a signed 32-bit integer using the Round towards Zero rounding mode, and returns the result vector.
private Vector64<int> ConvertToInt32RoundToZeroTest(Vector64<float> value)
{
return AdvSimd.ConvertToInt32RoundToZero(value);
}
// value = <11.5, 12.5>
// Result = <11, 12>
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd
Vector128<int> ConvertToInt32RoundToZero(Vector128<float> value)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt32RoundToZeroTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[Int32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtzs v16.2s, v0.2s
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
22. ConvertToInt32RoundToZeroScalar
Vector64<int> ConvertToInt32RoundToZeroScalar(Vector64<float> value)
This method converts only the first element of the value vector from a floating-point value to a signed 32-bit integer using the Round towards Zero rounding mode, sets the remaining elements of the result vector to zero, and returns the result vector.
private Vector64<int> ConvertToInt32RoundToZeroScalarTest(Vector64<float> value)
{
return AdvSimd.ConvertToInt32RoundToZeroScalar(value);
}
// value = <11.5, 12.5>
// Result = <11, 0>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt32RoundToZeroScalarTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[Int32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtzs s16, s0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
23. ConvertToInt64RoundAwayFromZero
Vector128<long> ConvertToInt64RoundAwayFromZero(Vector128<double> value)
This method converts each element in the value vector from a floating-point value to a 64-bit signed integer using the Round to Nearest with Ties to Away rounding mode, and returns the result vector.
private Vector128<long> ConvertToInt64RoundAwayFromZeroTest(Vector128<double> value)
{
return AdvSimd.Arm64.ConvertToInt64RoundAwayFromZero(value);
}
// value = <11.5, 12.5>
// Result = <12, 13>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt64RoundAwayFromZeroTest(System.Runtime.Intrinsics.Vector128`1[Double]):System.Runtime.Intrinsics.Vector128`1[Int64]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd16 -> d0 HFA(simd16)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtas v16.2d, v0.2d
mov v0.16b, v16.16b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
24. ConvertToInt64RoundAwayFromZeroScalar
Vector64<long> ConvertToInt64RoundAwayFromZeroScalar(Vector64<double> value)
This method converts the single element of the value vector from a floating-point value to a 64-bit signed integer using the Round to Nearest with Ties to Away rounding mode, and returns the result vector.
private Vector64<long> ConvertToInt64RoundAwayFromZeroScalarTest(Vector64<double> value)
{
return AdvSimd.Arm64.ConvertToInt64RoundAwayFromZeroScalar(value);
}
// value = <11.5>
// Result = <12>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt64RoundAwayFromZeroScalarTest(System.Runtime.Intrinsics.Vector64`1[Double]):System.Runtime.Intrinsics.Vector64`1[Int64]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtas d16, d0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
25. ConvertToInt64RoundToEven
Vector128<long> ConvertToInt64RoundToEven(Vector128<double> value)
This method converts each element in the value vector from a floating-point value to a 64-bit signed integer using the Round to Nearest with Ties to Even rounding mode, and returns the result vector.
private Vector128<long> ConvertToInt64RoundToEvenTest(Vector128<double> value)
{
return AdvSimd.Arm64.ConvertToInt64RoundToEven(value);
}
// value = <11.5, 12.5>
// Result = <12, 12>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt64RoundToEvenTest(System.Runtime.Intrinsics.Vector128`1[Double]):System.Runtime.Intrinsics.Vector128`1[Int64]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd16 -> d0 HFA(simd16)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtns v16.2d, v0.2d
mov v0.16b, v16.16b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
26. ConvertToInt64RoundToEvenScalar
Vector64<long> ConvertToInt64RoundToEvenScalar(Vector64<double> value)
This method converts the single element of the value vector from a floating-point value to a 64-bit signed integer using the Round to Nearest with Ties to Even rounding mode, and returns the result vector.
private Vector64<long> ConvertToInt64RoundToEvenScalarTest(Vector64<double> value)
{
return AdvSimd.Arm64.ConvertToInt64RoundToEvenScalar(value);
}
// value = <11.5>
// Result = <12>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt64RoundToEvenScalarTest(System.Runtime.Intrinsics.Vector64`1[Double]):System.Runtime.Intrinsics.Vector64`1[Int64]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtns d16, d0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
27. ConvertToInt64RoundToNegativeInfinity
Vector128<long> ConvertToInt64RoundToNegativeInfinity(Vector128<double> value)
This method converts each element in the value vector from a floating-point value to a 64-bit signed integer using the Round towards Minus Infinity rounding mode, and returns the result.
private Vector128<long> ConvertToInt64RoundToNegativeInfinityTest(Vector128<double> value)
{
return AdvSimd.Arm64.ConvertToInt64RoundToNegativeInfinity(value);
}
// value = <11.5, 12.5>
// Result = <11, 12>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt64RoundToNegativeInfinityTest(System.Runtime.Intrinsics.Vector128`1[Double]):System.Runtime.Intrinsics.Vector128`1[Int64]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd16 -> d0 HFA(simd16)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtms v16.2d, v0.2d
mov v0.16b, v16.16b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
28. ConvertToInt64RoundToNegativeInfinityScalar
Vector64<long> ConvertToInt64RoundToNegativeInfinityScalar(Vector64<double> value)
This method converts the single element of the value vector from a floating-point value to a 64-bit signed integer using the Round towards Minus Infinity rounding mode, and returns the result.
private Vector64<long> ConvertToInt64RoundToNegativeInfinityScalarTest(Vector64<double> value)
{
return AdvSimd.Arm64.ConvertToInt64RoundToNegativeInfinityScalar(value);
}
// value = <11.5>
// Result = <11>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt64RoundToNegativeInfinityScalarTest(System.Runtime.Intrinsics.Vector64`1[Double]):System.Runtime.Intrinsics.Vector64`1[Int64]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtms d16, d0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
29. ConvertToInt64RoundToPositiveInfinity
Vector128<long> ConvertToInt64RoundToPositiveInfinity(Vector128<double> value)
This method converts each element in the value vector from a floating-point value to a 64-bit signed integer using the Round towards Plus Infinity rounding mode, and returns the result.
private Vector128<long> ConvertToInt64RoundToPositiveInfinityTest(Vector128<double> value)
{
return AdvSimd.Arm64.ConvertToInt64RoundToPositiveInfinity(value);
}
// value = <11.5, 12.5>
// Result = <12, 13>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt64RoundToPositiveInfinityTest(System.Runtime.Intrinsics.Vector128`1[Double]):System.Runtime.Intrinsics.Vector128`1[Int64]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd16 -> d0 HFA(simd16)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtps v16.2d, v0.2d
mov v0.16b, v16.16b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
30. ConvertToInt64RoundToPositiveInfinityScalar
Vector64<long> ConvertToInt64RoundToPositiveInfinityScalar(Vector64<double> value)
This method converts the single element of the value vector from a floating-point value to a 64-bit signed integer using the Round towards Plus Infinity rounding mode, and returns the result.
private Vector64<long> ConvertToInt64RoundToPositiveInfinityScalarTest(Vector64<double> value)
{
return AdvSimd.Arm64.ConvertToInt64RoundToPositiveInfinityScalar(value);
}
// value = <11.5>
// Result = <12>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt64RoundToPositiveInfinityScalarTest(System.Runtime.Intrinsics.Vector64`1[Double]):System.Runtime.Intrinsics.Vector64`1[Int64]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtps d16, d0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
31. ConvertToInt64RoundToZero
Vector128<long> ConvertToInt64RoundToZero(Vector128<double> value)
This method converts each element in the value vector from a floating-point value to a 64-bit signed integer using the Round towards Zero rounding mode, and returns the result.
private Vector128<long> ConvertToInt64RoundToZeroTest(Vector128<double> value)
{
return AdvSimd.Arm64.ConvertToInt64RoundToZero(value);
}
// value = <11.5, 12.5>
// Result = <11, 12>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt64RoundToZeroTest(System.Runtime.Intrinsics.Vector128`1[Double]):System.Runtime.Intrinsics.Vector128`1[Int64]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd16 -> d0 HFA(simd16)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtzs v16.2d, v0.2d
mov v0.16b, v16.16b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
32. ConvertToInt64RoundToZeroScalar
Vector64<long> ConvertToInt64RoundToZeroScalar(Vector64<double> value)
This method converts the single element of the value vector from a floating-point value to a 64-bit signed integer using the Round towards Zero rounding mode, and returns the result.
private Vector64<long> ConvertToInt64RoundToZeroScalarTest(Vector64<double> value)
{
return AdvSimd.Arm64.ConvertToInt64RoundToZeroScalar(value);
}
// value = <11.5>
// Result = <11>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToInt64RoundToZeroScalarTest(System.Runtime.Intrinsics.Vector64`1[Double]):System.Runtime.Intrinsics.Vector64`1[Int64]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtzs d16, d0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
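The five rounding modes used by the ConvertToInt64* methods above differ only in how they treat the fractional part. The following is a rough software model of one lane, written with System.Math so it runs on any machine. The class and method names are hypothetical, and the model assumes in-range, non-NaN inputs; it does not attempt to reproduce how the hardware instructions handle out-of-range values or NaN.

```csharp
using System;

// Hypothetical per-lane reference model; not part of the .NET intrinsics API.
public static class RoundingModeSketch
{
    // Round to Nearest with Ties to Away (fcvtas)
    public static long AwayFromZero(double x) => (long)Math.Round(x, MidpointRounding.AwayFromZero);

    // Round to Nearest with Ties to Even (fcvtns)
    public static long ToEven(double x) => (long)Math.Round(x, MidpointRounding.ToEven);

    // Round towards Minus Infinity (fcvtms)
    public static long ToNegativeInfinity(double x) => (long)Math.Floor(x);

    // Round towards Plus Infinity (fcvtps)
    public static long ToPositiveInfinity(double x) => (long)Math.Ceiling(x);

    // Round towards Zero (fcvtzs)
    public static long ToZero(double x) => (long)Math.Truncate(x);
}

// Input   AwayFromZero  ToEven  ToNegativeInfinity  ToPositiveInfinity  ToZero
//  11.5        12          12          11                  12             11
//  12.5        13          12          12                  13             12
// -11.5       -12         -12         -12                 -11            -11
```

The negative row is worth noting: the examples in this post only use positive inputs, for which Round towards Zero and Round towards Minus Infinity give the same answer.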
33. ConvertToSingle
Vector64<float> ConvertToSingle(Vector64<int> value)
This method converts each element in the value vector from a signed integer to a floating-point value using the rounding mode that, as per ARM docs, is specified by the FPCR, and returns the result.
private Vector64<float> ConvertToSingleTest(Vector64<int> value)
{
return AdvSimd.ConvertToSingle(value);
}
// value = <11, 12>
// Result = <11, 12>
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd
Vector64<float> ConvertToSingle(Vector64<uint> value)
Vector128<float> ConvertToSingle(Vector128<int> value)
Vector128<float> ConvertToSingle(Vector128<uint> value)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToSingleTest(System.Runtime.Intrinsics.Vector64`1[Int32]):System.Runtime.Intrinsics.Vector64`1[Single]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
scvtf v16.2s, v0.2s
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
34. ConvertToSingleLower
Vector64<float> ConvertToSingleLower(Vector128<double> value)
This method narrows each element in the value vector to half the precision of the source element and stores it in the result vector. As seen below, the result element type float is half as wide as the input element type double. The rounding mode is determined by the FPCR.
private Vector64<float> ConvertToSingleLowerTest(Vector128<double> value)
{
return AdvSimd.Arm64.ConvertToSingleLower(value);
}
// value = <11.5, 12.5>
// Result = <11.5, 12.5>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToSingleLowerTest(System.Runtime.Intrinsics.Vector128`1[Double]):System.Runtime.Intrinsics.Vector64`1[Single]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd16 -> d0 HFA(simd16)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtn v16.2s, v0.2d
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
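The example values 11.5 and 12.5 are exactly representable as float, so the narrowing above appears to change nothing. A small sketch shows when precision is actually lost. It models one lane with an ordinary C# cast, assuming the default round-to-nearest mode; NarrowingSketch and its members are hypothetical names used only for this illustration.

```csharp
using System;

// Hypothetical per-lane model of double -> float narrowing; not part of the .NET intrinsics API.
public static class NarrowingSketch
{
    // An explicit cast narrows a double to the nearest float.
    public static float Narrow(double x) => (float)x;

    // True when x survives the round trip double -> float -> double unchanged.
    public static bool IsExactInSingle(double x) => (double)Narrow(x) == x;
}

// NarrowingSketch.IsExactInSingle(11.5) is true:  11.5 needs only a few mantissa bits.
// NarrowingSketch.IsExactInSingle(0.1)  is false: 0.1 is rounded differently in float and double.
```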
35. ConvertToSingleRoundToOddLower
Vector64<float> ConvertToSingleRoundToOddLower(Vector128<double> value)
This method narrows each element in the value vector to half the precision using the Round to Odd rounding mode, and stores the result in the result vector. For details see the ARM docs.
private Vector64<float> ConvertToSingleRoundToOddLowerTest(Vector128<double> value)
{
return AdvSimd.Arm64.ConvertToSingleRoundToOddLower(value);
}
// value = <11.5, 12.5>
// Result = <11.5, 12.5>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToSingleRoundToOddLowerTest(System.Runtime.Intrinsics.Vector128`1[Double]):System.Runtime.Intrinsics.Vector64`1[Single]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd16 -> d0 HFA(simd16)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtxn v16.2s, v0.2d
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
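Because 11.5 and 12.5 are exact in single precision, the example above cannot show what Round to Odd does. The sketch below is a software approximation of one lane, based on the usual definition of round to odd: an exact result is returned unchanged, and an inexact result is truncated towards zero and then has its lowest mantissa bit forced to 1. RoundToOddSketch is a hypothetical name, and the code assumes a finite input within the range of float; it is not a bit-exact model of the instruction for every input.

```csharp
using System;

// Hypothetical per-lane model of round-to-odd narrowing; not part of the .NET intrinsics API.
public static class RoundToOddSketch
{
    // Assumption: x is finite and its magnitude does not exceed float.MaxValue.
    public static float NarrowRoundToOdd(double x)
    {
        float nearest = (float)x;                  // ordinary round-to-nearest narrowing
        if ((double)nearest == x) return nearest;  // exact: nothing to round

        int bits = BitConverter.SingleToInt32Bits(nearest);
        // If round-to-nearest moved away from zero, step back one float towards zero
        // so that bits now holds the truncated value.
        if (Math.Abs((double)nearest) > Math.Abs(x)) bits -= 1;

        // Inexact result: force the lowest mantissa bit to 1.
        return BitConverter.Int32BitsToSingle(bits | 1);
    }
}
```

For example, 1 + 2^-30 narrows to 1.0f under round-to-nearest, but to the next float above 1.0f (bit pattern 0x3F800001) under this model of round to odd.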
36. ConvertToSingleRoundToOddUpper
Vector128<float> ConvertToSingleRoundToOddUpper(Vector64<float> lower, Vector128<double> value)
This method narrows each element in the value vector to half the precision using the Round to Odd rounding mode and stores the results in the upper half of the result vector; the lower half of the result is copied from the lower vector. For details see the ARM docs.
private Vector128<float> ConvertToSingleRoundToOddUpperTest(Vector64<float> lower, Vector128<double> value)
{
return AdvSimd.Arm64.ConvertToSingleRoundToOddUpper(lower, value);
}
// lower = <11.5, 12.5>
// value = <11.5, 12.5>
// Result = <11.5, 12.5, 11.5, 12.5>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToSingleRoundToOddUpperTest(System.Runtime.Intrinsics.Vector64`1[Single],System.Runtime.Intrinsics.Vector128`1[Double]):System.Runtime.Intrinsics.Vector128`1[Single]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
; V01 arg1 [V01,T01] ( 3, 3 ) simd16 -> d1 HFA(simd16)
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtxn2 v0.4s, v1.2d
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 20, prolog size 8
37. ConvertToSingleScalar
Vector64<float> ConvertToSingleScalar(Vector64<int> value)
This method converts only the first element of the value vector from a signed integer to a floating-point value using the rounding mode that is specified by the FPCR, sets the remaining elements of the result vector to zero, and returns the result.
private Vector64<float> ConvertToSingleScalarTest(Vector64<int> value)
{
return AdvSimd.ConvertToSingleScalar(value);
}
// value = <11, 12>
// Result = <11, 0>
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd
Vector64<float> ConvertToSingleScalar(Vector64<uint> value)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToSingleScalarTest(System.Runtime.Intrinsics.Vector64`1[Int32]):System.Runtime.Intrinsics.Vector64`1[Single]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
scvtf s16, s0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
38. ConvertToSingleUpper
Vector128<float> ConvertToSingleUpper(Vector64<float> lower, Vector128<double> value)
This method narrows each element in the value vector to half the precision and stores the results in the upper half of the result vector; the lower half of the result is copied from the lower vector. As seen in the example below, the result element type float is half as wide as the input element type double. The rounding mode, as per ARM docs, is determined by the FPCR.
private Vector128<float> ConvertToSingleUpperTest(Vector64<float> lower, Vector128<double> value)
{
return AdvSimd.Arm64.ConvertToSingleUpper(lower, value);
}
// lower = <5.1, 5.1>
// value = <11.5, 12.5>
// Result = <5.1, 5.1, 11.5, 12.5>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToSingleUpperTest(System.Runtime.Intrinsics.Vector64`1[Single],System.Runtime.Intrinsics.Vector128`1[Double]):System.Runtime.Intrinsics.Vector128`1[Single]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
; V01 arg1 [V01,T01] ( 3, 3 ) simd16 -> d1 HFA(simd16)
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtn2 v0.4s, v1.2d
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 20, prolog size 8
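The way the Upper form assembles its result can be sketched as follows. UpperNarrowingSketch and NarrowIntoUpper are hypothetical names; the fallback branch is a plain-C# approximation (assuming the default round-to-nearest mode) that exists only so the snippet runs on machines without Arm64 support.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.Arm;

// Hypothetical helper for illustration; not part of the .NET API.
public static class UpperNarrowingSketch
{
    public static Vector128<float> NarrowIntoUpper(Vector64<float> lower, Vector128<double> value)
    {
        if (AdvSimd.Arm64.IsSupported)
        {
            // Hardware path (the listing above shows fcvtn2 v0.4s, v1.2d).
            return AdvSimd.Arm64.ConvertToSingleUpper(lower, value);
        }

        // Fallback: narrow both doubles, then place them above the two floats of 'lower'.
        Vector64<float> narrowed = Vector64.Create(
            (float)value.GetElement(0),
            (float)value.GetElement(1));
        return Vector128.Create(lower, narrowed);
    }
}
```

With lower = <5.1, 5.1> and value = <11.5, 12.5>, both paths give <5.1, 5.1, 11.5, 12.5>, matching the example above.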
39. ConvertToUInt32RoundAwayFromZero
Vector64<uint> ConvertToUInt32RoundAwayFromZero(Vector64<float> value)
This method converts each element in the value vector from a floating-point value to an unsigned 32-bit integer using the Round to Nearest with Ties to Away rounding mode, and returns the result vector.
private Vector64<uint> ConvertToUInt32RoundAwayFromZeroTest(Vector64<float> value)
{
return AdvSimd.ConvertToUInt32RoundAwayFromZero(value);
}
// value = <11.5, 12.5>
// Result = <12, 13>
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd
Vector128<uint> ConvertToUInt32RoundAwayFromZero(Vector128<float> value)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToUInt32RoundAwayFromZeroTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[UInt32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtau v16.2s, v0.2s
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
40. ConvertToUInt32RoundAwayFromZeroScalar
Vector64<uint> ConvertToUInt32RoundAwayFromZeroScalar(Vector64<float> value)
This method converts only the first element of the value vector from a floating-point value to an unsigned 32-bit integer using the Round to Nearest with Ties to Away rounding mode, sets the remaining elements of the result vector to zero, and returns the result vector.
private Vector64<uint> ConvertToUInt32RoundAwayFromZeroScalarTest(Vector64<float> value)
{
return AdvSimd.ConvertToUInt32RoundAwayFromZeroScalar(value);
}
// value = <11.5, 12.5>
// Result = <12, 0>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToUInt32RoundAwayFromZeroScalarTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[UInt32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtau s16, s0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
41. ConvertToUInt32RoundToEven
Vector64<uint> ConvertToUInt32RoundToEven(Vector64<float> value)
This method converts each element in the value vector from a floating-point value to an unsigned 32-bit integer using the Round to Nearest with Ties to Even rounding mode, and returns the result vector.
private Vector64<uint> ConvertToUInt32RoundToEvenTest(Vector64<float> value)
{
return AdvSimd.ConvertToUInt32RoundToEven(value);
}
// value = <11.5, 12.5>
// Result = <12, 12>
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd
Vector128<uint> ConvertToUInt32RoundToEven(Vector128<float> value)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToUInt32RoundToEvenTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[UInt32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtnu v16.2s, v0.2s
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
42. ConvertToUInt32RoundToEvenScalar
Vector64<uint> ConvertToUInt32RoundToEvenScalar(Vector64<float> value)
This method converts only the first element of the value vector from a floating-point value to an unsigned 32-bit integer using the Round to Nearest with Ties to Even rounding mode, sets the remaining elements of the result vector to zero, and returns the result vector.
private Vector64<uint> ConvertToUInt32RoundToEvenScalarTest(Vector64<float> value)
{
return AdvSimd.ConvertToUInt32RoundToEvenScalar(value);
}
// value = <11.5, 12.5>
// Result = <12, 0>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToUInt32RoundToEvenScalarTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[UInt32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtnu s16, s0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
43. ConvertToUInt32RoundToNegativeInfinity
Vector64<uint> ConvertToUInt32RoundToNegativeInfinity(Vector64<float> value)
This method converts each element in the value vector from a floating-point value to an unsigned 32-bit integer using the Round towards Minus Infinity rounding mode, and returns the result vector.
private Vector64<uint> ConvertToUInt32RoundToNegativeInfinityTest(Vector64<float> value)
{
return AdvSimd.ConvertToUInt32RoundToNegativeInfinity(value);
}
// value = <11.5, 12.5>
// Result = <11, 12>
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd
Vector128<uint> ConvertToUInt32RoundToNegativeInfinity(Vector128<float> value)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToUInt32RoundToNegativeInfinityTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[UInt32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtmu v16.2s, v0.2s
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
44. ConvertToUInt32RoundToNegativeInfinityScalar
Vector64<uint> ConvertToUInt32RoundToNegativeInfinityScalar(Vector64<float> value)
This method converts only the first element of the value vector from a floating-point value to an unsigned 32-bit integer using the Round towards Minus Infinity rounding mode, sets the remaining elements of the result vector to zero, and returns the result vector.
private Vector64<uint> ConvertToUInt32RoundToNegativeInfinityScalarTest(Vector64<float> value)
{
return AdvSimd.ConvertToUInt32RoundToNegativeInfinityScalar(value);
}
// value = <11.5, 12.5>
// Result = <11, 0>
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToUInt32RoundToNegativeInfinityScalarTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[UInt32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtmu s16, s0
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
45. ConvertToUInt32RoundToPositiveInfinity
Vector64<uint> ConvertToUInt32RoundToPositiveInfinity(Vector64<float> value)
This method converts each element in the value vector from a floating-point value to an unsigned 32-bit integer using the Round towards Plus Infinity rounding mode, and returns the result vector.
private Vector64<uint> ConvertToUInt32RoundToPositiveInfinityTest(Vector64<float> value)
{
return AdvSimd.ConvertToUInt32RoundToPositiveInfinity(value);
}
// value = <11.5, 12.5>
// Result = <12, 13>
Similar APIs that operate on different sizes:
// class System.Runtime.Intrinsics.Arm.AdvSimd
Vector128<uint> ConvertToUInt32RoundToPositiveInfinity(Vector128<float> value)
See Microsoft docs here, ARM docs here.
Assembly generated:
; Assembly listing for method AdvSimdMethods:ConvertToUInt32RoundToPositiveInfinityTest(System.Runtime.Intrinsics.Vector64`1[Single]):System.Runtime.Intrinsics.Vector64`1[UInt32]
;
; V00 arg0 [V00,T00] ( 3, 3 ) simd8 -> d0 HFA(simd8)
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [sp+0x00] "OutgoingArgSpace"
; Lcl frame size = 0
stp fp, lr, [sp,#-16]!
mov fp, sp
fcvtpu v16.2s, v0.2s
mov v0.8b, v16.8b
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code 24, prolog size 8
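In application code these intrinsics are normally guarded with an IsSupported check and paired with a software fallback. The sketch below does this for ConvertToUInt32RoundToPositiveInfinity. The names UInt32ConversionSketch, CeilingToUInt32 and CeilingOne are hypothetical, and the fallback's treatment of negative, NaN and too-large inputs (clamping to 0 or uint.MaxValue) is my assumption about how to mimic the instruction's saturating behaviour; check the ARM docs before relying on it.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.Arm;

// Hypothetical helper for illustration; not part of the .NET API.
public static class UInt32ConversionSketch
{
    public static Vector64<uint> CeilingToUInt32(Vector64<float> value)
    {
        if (AdvSimd.IsSupported)
        {
            // Hardware path (the listing above shows fcvtpu v16.2s, v0.2s).
            return AdvSimd.ConvertToUInt32RoundToPositiveInfinity(value);
        }

        // Fallback: convert each of the two lanes in software.
        return Vector64.Create(
            CeilingOne(value.GetElement(0)),
            CeilingOne(value.GetElement(1)));
    }

    private static uint CeilingOne(float x)
    {
        // Assumption: NaN and non-positive inputs produce 0.
        if (float.IsNaN(x) || x <= 0f) return 0u;

        double rounded = Math.Ceiling((double)x);
        // Assumption: values beyond the uint range clamp to uint.MaxValue.
        return rounded >= uint.MaxValue ? uint.MaxValue : (uint)rounded;
    }
}
```

With value = <11.5, 12.5> this returns <12, 13>, the same as the example in section 45.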